Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bbox bottom right corner calculation #314

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

st235
Copy link

@st235 st235 commented Jun 30, 2023

The boxes are given in (top left x, top left y, width, height) format. In order to get bottom right corner we need to add width and height to x and y correspondingly and subtract 1.

On the line 62 in tools/generate_detections.py you treat image rect correctly by subtracting 1 from the image box, which is originally (0, 0, w, h) and after subtracting (0, 0, w - 1, h - 1).

bbox[2:] = np.minimum(np.asarray(image.shape[:2][::-1]) - 1, bbox[2:])

however, you don't do it in other places across the repo, for example, on line 57 in the same file

# convert to top left, bottom right
bbox[2:] += bbox[:2]

By not doing so, you get a pixel that is diagonally to the bottom right corner. This minor inconsistency is really confusing.

There are 2 ways of how to resolve it:

  1. Fix rect calculation across the codebase (subtract 1 from other rects when converted to (left, top, right, bottom))
  2. Make image rect consistent to other rects and do not subtract 1 for the image right bottom corner

I believe the first approach is more reasonable to take as it also fixes how iou is calculated. Below is an example of how one can figure out the right bottom corner of a rect:

Screenshot 2023-06-30 at 11 24 44

This is not a critical issue and should not dramatically affect metrics though would be really nice to fix.

@st235
Copy link
Author

st235 commented Jun 30, 2023

@nwojke can you, please, take a look?

The boxes are given in (top left x, top left y, width, height)
format. In order to get bottom right corner we need to add width and
height to x and y correspondingly and subtract 1.
@st235 st235 force-pushed the fix/bbox_br_calculation branch from 01d2085 to a85801f Compare June 30, 2023 10:16
@st235
Copy link
Author

st235 commented Jul 12, 2023

@nwojke any luck you can take a look?

@st235
Copy link
Author

st235 commented Aug 7, 2023

@nwojke can you, please, take a look?

@SajjadPSavoji
Copy link

As far as I know W = x2 - x1 and H = y2 -y1.

According to that x2(right) = x1(Let) + W not "x2(right) = x1(Let) + W -1".

Maybe you are changing your bboxes in other parts of your implementation.

@st235
Copy link
Author

st235 commented Nov 14, 2023

Hello @SajjadPSavoji

Thank you for the provided feedback. I really appreciate it.

As far as I know W = x2 - x1 and H = y2 -y1.
According to that x2(right) = x1(Let) + W not "x2(right) = x1(Let) + W -1".
Maybe you are changing your bboxes in other parts of your implementation.

Check out the places I was highlighting in the original comment. I agree, the problem seems a bit tricky.

There are a few inconsistencies around this place.

The original comment on the line 32 says

    bbox : array_like
        The bounding box in format (x, y, width, height).

I.e. if you have a one pixel box, let's say (0, 0, 1, 1), then it seems it is occupying the only pixel at the position 0,0.

Both ways of handling this box would be correct: either consider the right and bottom corners of it in inclusive manner or exclusive one.

Though, the lines 57 and 62 seem inconsistent to each other.

The viewport on line 62 is considered originally as (0, 0, w, h). Right?

    # clip at image boundaries
    bbox[:2] = np.maximum(0, bbox[:2])
    bbox[2:] = np.minimum(np.asarray(image.shape[:2][::-1]) - 1, bbox[2:])

And the box is clipped by (0, 0, w - 1, h - 1).

At line 57 we don't do the same thing.

# convert to top left, bottom right
bbox[2:] += bbox[:2]
bbox = bbox.astype(np.int)

Imagine we have a bounding box (0, 0, viewport_width, viewport_height). Consider those are the width and height of the bounding box (from the pydoc) we trim them at line 62, and the box becomes (0, 0, viewport_width - 1, viewport_height - 1). At the same time, there is nothing wrong with the box having the same width and height as the viewport as this box will be the box with the size of the whole image and it seems it should not be clipped in that case.

As I said before, the inconsistency seems rather minor. However, to fix the issue, we can do a different change, for example, instead of changing boxes handling we can change the viewport handling in here.

Something similar to this

    bbox[2:] = np.minimum(np.asarray(image.shape[:2][::-1]), bbox[2:])

Hopefully, the explanation seems reasonable. What do you think?

P.S.: regardless of the way to fix the issue, I think without receiving the initial maintainers feedback there is nothing we can do about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants